Abstract

We study the problem of computing the upper bound of the discrete Frechet distance for imprecise input, and 
prove that the problem is NP-hard. This solves an open problem posed in 2010 by Ahn et al. If shortcuts are 
allowed, we show that the upper bound of the discrete Frechet distance with shortcuts for imprecise input can be 
computed in polynomial time and we present several efficient algorithms. 


1 Introduction 

The Frechet distance is a natural measure of similarity between two curves Q. The Frechet distance between two 
curves is often referred to as the “dog-leash distance”. Imagine a dog and its handler are walking on their respective 
curves, connected by a leash, and they both can control their speed but cannot walk back. The Frechet distance 
of these two curves is the minimum length of any leash necessary for the dog and the handler to move from their 
starting points on the two curves to their respective endpoints. Alt and Godau Q presented an algorithm to compute 
the Frechet distance between two polygonal curves of n and m vertices in $O(nm \log(nm))$ time. There have been
many applications of the Frechet distance to pattern and curve matching. For instance, the Frechet distance has
been extended to graphs (maps) [6, 10], to piecewise smooth curves [26], to simple polygons [11], to surfaces Q, to
network distance ifT^, and to the case when there is a speed limit [24].

On the other hand, the Frechet distance is sensitive to local errors: a small local change can change the Frechet
distance greatly. To handle such outliers, Driemel and Har-Peled [13] introduced the Frechet distance
with shortcuts.

A slightly simpler version of the Frechet distance is the discrete Frechet distance, where only the vertices of
polygonal curves are considered. As a symmetric example, imagine two frogs, connected
by a thread, hopping on two polygonal chains; each can hop from a vertex to the next or wait, but can never hop back.
Then the discrete Frechet distance is the minimum thread length that allows the two frogs to reach the ends of their
respective chains. When many points (vertices) are added evenly on the two polygonal chains, the discrete Frechet
distance gives a natural approximation of the (continuous) Frechet distance. The discrete Frechet distance is more
suitable for some applications, such as protein structure alignment [17, 30], in which each vertex represents the
α-carbon atom of an amino acid. In this case, using the (continuous) Frechet distance would produce results that are
not biologically meaningful. In this paper, we focus on the discrete Frechet distance.

It takes $O(mn)$ time to compute the discrete Frechet distance using a standard dynamic programming
technique ifldll. Recently, this bound was slightly improved 13. Most of the important applications of the discrete
Frechet distance are biology-related [15, 17, 28, 30]. Some other applications of the discrete Frechet distance
simply study the counterpart of a problem posed for the (continuous) Frechet distance. For instance, given a polygonal
curve P and a set of points S, Maheshwari et al. studied the problem of computing a polygonal curve through S which has
a minimum Frechet distance to P [25]. The corresponding problem using the discrete Frechet distance is studied
in [29].
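The standard dynamic program mentioned above can be sketched as follows. This is a minimal illustration, not code from the cited works; the function name `discrete_frechet` and the representation of points as coordinate tuples are assumptions made for the example.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, ...]


def discrete_frechet(a: Sequence[Point], b: Sequence[Point]) -> float:
    """Discrete Frechet distance of two point sequences in O(nm) time."""
    if not a or not b:
        raise ValueError("both sequences must be non-empty")
    m = len(b)
    prev = [0.0] * m  # table row for the previous vertex of a
    for i, p in enumerate(a):
        cur = [0.0] * m
        for j, q in enumerate(b):
            d = math.dist(p, q)
            if i == 0 and j == 0:
                best = 0.0                      # starting pair
            elif i == 0:
                best = cur[j - 1]               # only b may advance
            elif j == 0:
                best = prev[0]                  # only a may advance
            else:
                # a advances, both advance, or b advances
                best = min(prev[j], prev[j - 1], cur[j - 1])
            cur[j] = max(best, d)
        prev = cur
    return prev[m - 1]
```

Only two rows of the table are kept, so the extra space is linear in the length of the second sequence.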

It is worth mentioning that, symmetric to the Frechet distance with shortcuts [13], the discrete Frechet distance
with shortcuts was also studied recently by Avraham et al. [8]. A novel technique, based on distance selection, was
designed to compute the discrete Frechet distance with shortcuts efficiently. In Section 4, we will also use the discrete
Frechet distance with shortcuts to compute the corresponding upper bounds for imprecise input.
Computational geometry with imprecise objects has attracted considerable interest in recent years.
There are two models: one is the continuous model, where a precise point is selected from an erroneous region (say,
a disk or a rectangle) GSll; the other is the discrete or color-spanning model, where a precise point is selected from
several discrete objects with the same color and all colors must be selected IT]. We will mainly focus on the continuous
model, but will also touch on the color-spanning model. (There is also a probabilistic model, which is not relevant to
this paper, so we skip it.) Many algorithms have been designed to handle imprecise geometric
problems in both models. For the continuous model, there are algorithms that handle imprecise data for computing the
Hausdorff distance ET], Voronoi diagrams lT7l, planar convex hulls [18, 23] and Delaunay triangulations [20, 22].

Ahn et al. studied the problem of computing the discrete Frechet distance between two imprecise point sequences,
and gave an efficient algorithm for computing the lower bound (of the distance) and efficient approximation algorithms
for the corresponding upper bound (under a realistic assumption) ||3]lll. It was unknown whether computing the discrete
Frechet distance upper bound for imprecise input is polynomially solvable, and Ahn et al. left that as an open
problem ESI. In this paper, we prove that the problem is in fact NP-hard. We also consider the same problem under
the discrete Frechet distance with shortcuts and give efficient polynomial-time solutions.

The paper is organized as follows. In Section 2, we give the necessary definitions. In Section 3, we prove that
computing the discrete Frechet distance upper bound for imprecise input is NP-hard; because the proof is complex, it is
split into several subsections. In Section 4, we consider the problem of computing the upper bound of the discrete
Frechet distance with shortcuts for imprecise input. In Section 5, we conclude the paper.

2 Preliminaries 

Throughout this paper, we use d(a, b) for the Euclidean distance between points a and b, possibly in $\mathbb{R}^k$,
where k is any positive integer.

We first define the discrete Frechet distance as follows. Let $A = (a_1, \ldots, a_n)$ and $B = (b_1, \ldots, b_m)$ be two
sequences of points of size n and m respectively, in $\mathbb{R}^k$. The discrete Frechet distance $d_{dF}(A, B)$ between
A and B is defined using the following graph. Given a distance $\delta > 0$, consider the Cartesian product $A \times B$
as the vertex set of a directed graph $G_\delta$ whose edge set is

$$
\begin{aligned}
E_\delta = {} & \{((a_i,b_j),(a_{i+1},b_j)) \mid d(a_i,b_j),\ d(a_{i+1},b_j) \le \delta\} \ \cup \\
& \{((a_i,b_j),(a_i,b_{j+1})) \mid d(a_i,b_j),\ d(a_i,b_{j+1}) \le \delta\} \ \cup \\
& \{((a_i,b_j),(a_{i+1},b_{j+1})) \mid d(a_i,b_j),\ d(a_{i+1},b_{j+1}) \le \delta\}.
\end{aligned}
$$

Then $d_{dF}(A, B)$ is the smallest $\delta > 0$ for which $(a_n, b_m)$ can be reached from $(a_1, b_1)$ in the graph $G_\delta$.

Definition For a region $q_i$, a precise point $a_i$ is called a realization of $q_i$ if $a_i \in q_i$. For a region sequence
$Q = (q_1, q_2, \ldots, q_n)$, the precise point sequence $A = (a_1, a_2, \ldots, a_n)$ is called a realization of Q if we have
$a_i \in q_i$ for all $1 \le i \le n$.

For the discrete Frechet distance of imprecise input, we use the same notions, such as the realization of an
imprecise input sequence, as in Q. To be consistent with these notations, we also use F(A, B) to denote the discrete
Frechet distance between A and B (i.e., $F(A, B) = d_{dF}(A, B)$).

Definition For two region sequences $Q = (q_1, q_2, \ldots, q_n)$ and $H = (h_1, h_2, \ldots, h_m)$, $A = (a_1, a_2, \ldots, a_n)$
(resp. $B = (b_1, b_2, \ldots, b_m)$) is a possible realization of Q (resp. H) if we have $a_i \in q_i$, $b_j \in h_j$ for all
$1 \le i \le n$, $1 \le j \le m$. The Frechet distance upper bound is $F^{\max}(Q, H) = \max\{F(A, B)\}$, where A (resp. B)
ranges over the possible realizations of Q (resp. H).

We comment that for region (or imprecise vertex) sequences, to obtain decent algorithmic bounds, we mainly
focus on regions that are balls (disks in 2D) in Section 4. (With some extra twists, it might be possible to handle
square or rectangular regions as well.) In the proof of NP-hardness in Section 3, however, the imprecise regions are
rectangles.

We show in the next section that computing $F^{\max}(Q, H)$ is NP-hard, which was an open problem posed by Ahn
et al. in [3].
Figure 1: Illustration of the constructed directed colored graph G from 3SAT. We also use different shapes for different
clause vertices.

3 Computing the discrete Frechet distance upper bound of imprecise input is NP-hard

In this section, we prove that deciding $F^{\max}(Q, H) \le \varepsilon$ is NP-hard. In fact, this holds even when H is a
precise vertex sequence and Q is an imprecise vertex sequence (where each vertex is modeled as a rectangle, not
necessarily axis-aligned). As the proof is quite complex, we separate it into several parts.

3.1 NP-hardness of an induced subgraph connectivity problem of colored sets 

First, we prove that an induced subgraph connectivity problem of colored sets is NP-hard, which is useful
for proving that deciding $F^{\max}(Q, H) \le \varepsilon$ is NP-hard. We define the induced subgraph connectivity problem
of colored sets (ISCPCS) as follows: let G be a graph with n vertices in the plane, each colored by one of m colors,
together with a fixed source vertex s, a fixed destination vertex t, and some directed edges between the vertices (where
no two edges cross); choose an induced subgraph $G_s$ consisting of exactly one vertex of each color such that in $G_s$
there is no path from s to t. For an example, see Figure 1. We prove that the ISCPCS problem is NP-hard by a reduction
from 3SAT.
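To make the ISCPCS definition concrete, the following brute-force checker enumerates every choice of one vertex per color and tests s-t reachability in the induced subgraph. It is an exponential-time illustration only. The function names and the modeling choice that s and t are uncolored and always kept are assumptions of this sketch, not part of the definition above.

```python
from itertools import product
from typing import Dict, Hashable, Iterable, List, Optional, Set, Tuple

Vertex = Hashable
Edge = Tuple[Vertex, Vertex]


def reachable(kept: Set[Vertex], edges: Iterable[Edge], s: Vertex, t: Vertex) -> bool:
    """Depth-first search from s to t using only directed edges inside `kept`."""
    adj: Dict[Vertex, List[Vertex]] = {}
    for u, v in edges:
        if u in kept and v in kept:
            adj.setdefault(u, []).append(v)
    stack, seen = [s], {s}
    while stack:
        u = stack.pop()
        if u == t:
            return True
        for v in adj.get(u, []):
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return False


def find_disconnecting_choice(
    color_of: Dict[Vertex, Hashable], edges: List[Edge], s: Vertex, t: Vertex
) -> Optional[Set[Vertex]]:
    """Return one vertex per color whose induced subgraph has no s-t path, or None."""
    by_color: Dict[Hashable, List[Vertex]] = {}
    for v, c in color_of.items():
        by_color.setdefault(c, []).append(v)
    for choice in product(*by_color.values()):
        # Assumption: s and t carry no color and are always part of the subgraph.
        if not reachable(set(choice) | {s, t}, edges, s, t):
            return set(choice)
    return None
```

With a single color containing the vertices `x` and `nx`, and edges `s -> x -> t` only, choosing `nx` disconnects s from t.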

Lemma 3.1 ISCPCS is NP-hard. 

The detailed proof is in the appendix; an example is given in Figure 1.

3.2 The free space diagram 

The free space diagram of the discrete Frechet distance between realizations of Q and H is composed of a grid of
$n \times m$ cells, where n and m are the numbers of vertices in Q and H respectively. We first consider the case when both
Q and H are precise. In this case, let $q_i$ and $h_j$ denote the i-th and j-th vertex of Q and H respectively. Each pair
$(q_i, h_j)$ corresponds to the cell in the i-th row and the j-th column. By the definition of the discrete Frechet
distance, a valid traversal corresponds to a monotone path in the grid from cell (1, 1) to (n, m). In the sequel, for
ease of description, we sometimes loosely call such a path "a monotone path". We cover the details regarding such a path
next.

Cell $C[i,j] = (q_i, h_j)$ is painted white if $d(q_i, h_j) \le \varepsilon$, which indicates that this cell can be passed by
a potential monotone path. Cell $C[i,j] = (q_i, h_j)$ is painted gray if $d(q_i, h_j) > \varepsilon$, which indicates that this
cell cannot be passed by any monotone path. Each cell $C[i,j]$ can reach its monotone neighboring cell $C[i,j+1]$,
$C[i+1,j]$ or $C[i+1,j+1]$ if both of them are painted white. The discrete Frechet distance is the minimum $\varepsilon$ such
that there is a path from cell (1, 1) to (n, m) that is monotone in both the horizontal and vertical directions. See
Figure 2(a) for an example.
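For precise input, the white/gray painting and the monotone-path test can be sketched as below. This is an illustrative sketch under the conventions just described; the names `free_space_grid` and `has_monotone_path` are hypothetical, and indices are 0-based in the code.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def free_space_grid(q: Sequence[Point], h: Sequence[Point], eps: float) -> List[List[bool]]:
    """white[i][j] is True when d(q_i, h_j) <= eps (the cell can be passed)."""
    return [[math.dist(qi, hj) <= eps for hj in h] for qi in q]


def has_monotone_path(white: List[List[bool]]) -> bool:
    """True if white cells connect the first cell to the last cell monotonically."""
    n, m = len(white), len(white[0])
    reach = [[False] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            if not white[i][j]:
                continue  # gray cells are never reachable
            if i == 0 and j == 0:
                reach[i][j] = True
            else:
                reach[i][j] = (
                    (i > 0 and reach[i - 1][j])
                    or (j > 0 and reach[i][j - 1])
                    or (i > 0 and j > 0 and reach[i - 1][j - 1])
                )
    return reach[n - 1][m - 1]
```

The discrete Frechet distance is then the smallest `eps` for which `has_monotone_path(free_space_grid(q, h, eps))` holds.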
Figure 2: Illustration of the free space diagram of discrete Frechet distance with precise input (a); and, the free space
diagram of discrete Frechet distance with imprecise input (b).
Figure 3: Illustration of the variable gadget and clause gadget.


Now, we consider the free space diagram when $H = (h_1, h_2, \ldots, h_m)$ is a precise vertex sequence and
$Q = (q_1, q_2, \ldots, q_n)$ is an imprecise region sequence. There are several cases below.

• (1) If $d(q, h_j) \le \varepsilon$, $\forall q \in q_i$, then the cell $C[i,j]$ is painted white and can be passed.

• (2) If $d(q, h_j) > \varepsilon$, $\forall q \in q_i$, then the cell $C[i,j]$ is painted gray and cannot be passed.

• (3-a) There are two vertices $h_i, h_j$ and an imprecise vertex $q_k$ satisfying either $d(q, h_i) \le \varepsilon$ or
$d(q, h_j) \le \varepsilon$, $\forall q \in q_k$; see Figure 3(a). Then we paint the cells $C[k,i]$, $C[k,j]$ with the same
color, which shows that either $C[k,i]$ or $C[k,j]$ can be passed; see Figure 2(b). This case will be used as a variable
gadget.

• (3-b) There are three vertices $h_i, h_j, h_k$ and an imprecise vertex $q_x$ satisfying $d(q, h_i) \le \varepsilon$, or
$d(q, h_j) \le \varepsilon$, or $d(q, h_k) \le \varepsilon$, $\forall q \in q_x$; see Figure 3(b). Then we paint the cells
$C[x,i]$, $C[x,j]$ and $C[x,k]$ with the same color. This case can be used as a clause gadget. Of course, it is possible
that more than one of the cells $C[x,i]$, $C[x,j]$, $C[x,k]$ can be passed at the same time. But our objective is to
make the discrete Frechet distance as large as possible, which happens when only one of them is passed.

In fact, there could be more complicated cases than the three cases above, but we do not need them in our 
construction. 
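Cases (1) and (2) above can be tested in closed form when the imprecise vertex is a disk. The sketch below assumes disks purely for simplicity (the construction in this section uses rectangles), and the function name `classify_cell` and its return labels are hypothetical. Cells that are neither always passable nor never passable are reported as depending on the realization, which is the situation exploited by the gadgets in cases (3-a) and (3-b).

```python
import math
from typing import Tuple

Point = Tuple[float, float]


def classify_cell(center: Point, radius: float, h: Point, eps: float) -> str:
    """Classify cell (q, h) for a disk-shaped imprecise vertex q and a precise vertex h."""
    d = math.dist(center, h)
    if d + radius <= eps:
        return "white"    # every realization of q is within eps of h
    if d - radius > eps:
        return "gray"     # no realization of q is within eps of h
    return "depends"      # passable for some realizations only
```

For example, with `eps = 2.0` a disk of radius 0.5 centered at the origin gives a white cell for `h = (1, 0)` and a gray cell for `h = (3, 0)`.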

3.3 The grid graph for the color-spanning set 

The free space diagram of the discrete Frechet distance is really a directed grid graph. Now we show how to convert
an ISCPCS instance, e.g., the one in Figure 1, into a grid graph. The basic idea is as follows: the grid has n + m + 3
rows and 3m + 2n columns, and the colored cells in the grid correspond to the colored vertices in ISCPCS. The details
are as follows.

1. For the first row (from bottom up), all the cells are painted white, which means that all cells can be passed. 
The motivation is to make the starting cell (the lower-left cell C[1, 1], which corresponds to the start node s
in ISCPCS) in the grid graph reachable to all the colored clause cells (which correspond to clause vertices in 
ISCPCS). 
Figure 4: Illustration of the equivalence relation between the free space diagram (grid graph) and ISCPCS in Figure 1.
The horizontal coordinates denote the precise points, while the vertical coordinates denote the imprecise points. A
white cell can be passed, a gray cell cannot be passed, and among cells painted with the
same color (and the same number) any one can be passed.

2. From the 2nd row to the (m + 1)-th row (m is the number of clauses in the 3SAT instance from which the
ISCPCS instance is constructed), each row has three cells with the same color. We call them clause cells,
corresponding to the three clause vertices of the same color in ISCPCS. Each column has at most one clause
cell. If there is a clause cell C[k, j] in the j-th column, then the cells C[i, j], 2 ≤ i ≤ m + 1, i ≠ k, are painted
white and can be passed. If there is no clause cell in the j-th column, then the cells C[i, j], 2 ≤ i ≤ m + 1, are
painted gray and cannot be passed.

3. We do not put any clause cell in the (m + 2)-th row. If there exists a clause cell C[k, j], 2 ≤ k ≤ m + 1, in the
j-th column, then C[m + 2, j] is painted white and can be passed; otherwise, the cell C[m + 2, j] is painted
gray and cannot be passed.

4. From the (m + 3)-th row to the (n + m + 2)-th row (n is the number of variables in the 3SAT instance from
which the ISCPCS instance is constructed), each row has two cells with the same color. We call them variable
cells, which correspond to two variable vertices in the ISCPCS instance. (For an example, see the cells painted
with number 4 in Figure 4.) Each column has at most one variable cell. If there is a variable cell C[k, j] in the
j-th column, then the cells C[i, j], m + 3 ≤ i ≤ n + m + 2, i ≠ k, are painted white and can be passed. If
there is no variable cell in the j-th column, then the cells C[i, j], m + 3 ≤ i ≤ n + m + 2, are painted
gray and cannot be passed.

5. For the last row (from bottom up), all the cells are painted white, which means that all cells can be passed. The
motivation is to make sure that all the variable cells can connect to the final cell (the upper-right cell) in the grid
graph, which corresponds to the destination node t in ISCPCS.

6. There are a total of (3m + 2n) columns in the grid graph. If there are k clause vertices connecting to a fixed
variable vertex in ISCPCS, then there are k clause cells connecting to a variable cell (say, C[i, j]). The k clause
cells are located from the (j − k)-th column to the (j − 1)-th column, and the order of these columns is
adjusted so that those k clause cells are arranged from lower-left to upper-right. (For an example, see Figure 4.)
This design ensures that any monotone path from C[1, 1] to C[n + m + 3, 3m + 2n] has to pass through one
clause cell and one variable cell connected to it.


3.4 Realizing the grid graph geometrically 

To complete the proof that deciding $F^{\max}(Q, H) \le \varepsilon$ is NP-hard, we need to construct a precise vertex sequence
$H = (h_1, h_2, \ldots, h_{3m+2n})$ and an imprecise vertex sequence $Q = (q_1, q_2, q_3, \ldots, q_{n+m+3})$ (where each
imprecise vertex is modeled as a rectangle) such that the free space grid graph constructed above can be geometrically
realized.

Throughout the remaining parts, let C(a, r) (resp. D(a, r)) be a Euclidean circle (resp. disk) centered at a and
with radius r. The rectangles used to model imprecise points do not need to be along the same direction. The general
idea of realizing the grid graph geometrically is as follows.
Figure 5: Illustration of the general idea of realizing the grid graph geometrically.

1. For the 2nd to the (m + 1)-th rows of the grid graph, we design the points $P_i, M_i$, $i = 0, 1, 2, \ldots, m-1$, which
satisfy $d(P_i, M_i) = \varepsilon$ and $d(P_i, M_j) < \varepsilon - 2\varepsilon' < \varepsilon$ ($\varepsilon' \ll \varepsilon$) when $i \ne j$. Each clause
gadget is composed of three precise vertices $h_{i_1}, h_{i_2}, h_{i_3}$ ($i_1, i_2, i_3$ are indices of sequence H) and an
imprecise vertex $q_i$ as in Figure 3(b). $h_{i_1}, h_{i_2}, h_{i_3}$ are located inside $D(P_{i-2}, \varepsilon')$, and $q_i$ is located
inside $D(M_{i-2}, \varepsilon')$ when $2 \le i \le m+1$. All the points $P_i$, $i = 0, 1, 2, \ldots, m-1$, are located in a small
region with diameter less than $\varepsilon/10$.

2. For the (m + 3)-th to the (n + m + 2)-th rows, we design the points $P'_i, M'_i$, $i = 0, 1, 2, \ldots, n-1$, which satisfy
$d(P'_i, M'_i) = \varepsilon$ and $d(P'_i, M'_j) < \varepsilon - 2\varepsilon' < \varepsilon$ when $i \ne j$. Each variable gadget is composed of two
precise vertices $h_{i_1}, h_{i_2}$ and an imprecise vertex $q_i$ as in Figure 3(a). $h_{i_1}, h_{i_2}$ are located inside
$D(P'_{i-m-3}, \varepsilon')$, and $q_i$ is located inside $D(M'_{i-m-3}, \varepsilon')$ when $m+3 \le i \le n+m+2$. Again, all the points
$P'_i$, $i = 0, 1, 2, \ldots, n-1$, are located in another small region with diameter less than $\varepsilon/10$.

3. $d(P_i, M'_j) > 3\varepsilon/2 > \varepsilon$ and $d(P'_i, M_j) > 3\varepsilon/2 > \varepsilon$. Moreover, $d(p, q) < \varepsilon$ when
$p \in D(P_i, \varepsilon')$, $q \in D(P_j, \varepsilon')$, $i \ne j$, and $d(p, q) < \varepsilon$ when $p \in D(P'_i, \varepsilon')$,
$q \in D(P'_j, \varepsilon')$, and $i \ne j$.

4. For the first and last row, the first imprecise vertex $q_1$ and the last imprecise vertex $q_{n+m+3}$ are located in a
region $D((0, -3\varepsilon/4), \varepsilon')$ which is fully covered by any circle $C(p, \varepsilon)$ where
$p \in \bigcup D(P_i, \varepsilon')$ ($i = 0, 1, \ldots, m-1$) and by any circle $C(p, \varepsilon)$ where
$p \in \bigcup D(P'_i, \varepsilon')$ ($i = 0, 1, \ldots, n-1$).

5. For the (m + 2)-th row, the vertex $q_{m+2}$ is located inside a region $D((0, 0), \varepsilon')$, which is fully covered by
any circle $C(p, \varepsilon)$ where $p \in \bigcup D(P_i, \varepsilon')$ ($i = 0, 1, \ldots, m-1$) but not covered by any circle
$C(p, \varepsilon)$ where $p \in \bigcup D(P'_i, \varepsilon')$ ($i = 0, 1, \ldots, n-1$).

Due to space constraints, the details for realizing the grid graph are given in the appendix (Section 7.2). 

Theorem 3.2 Computing the upper bound of the discrete Frechet-distance with imprecise input is NP-hard. 

Proof From our construction and Lemma 3.1, $F^{\max}(Q, H) > \varepsilon$ if and only if there exists a choice of exactly
one passable cell of each color such that there is no monotone path from the lower-left cell to the upper-right cell in
the equivalent free space grid graph, which holds if and only if there exists an induced subgraph $G_s$ consisting of
exactly one vertex of each color in the equivalent colored graph G such that in $G_s$ there is no path from s to t, which
in turn is true if and only if the corresponding 3SAT instance is satisfiable. The total reduction time is
$O((m + n)^2)$, and the theorem is proven. □
4 The discrete Frechet distance with shortcuts for imprecise input


As covered in the introduction, the discrete Frechet distance is sensitive to local errors; hence, in practice, it makes 
sense to use the discrete Frechet distance with shortcuts [8]. This is defined as follows. (We comment that this idea
of taking shortcuts was used as early as in 2008 for simplifying protein backbones 10.) 

Definition One-sided discrete Frechet distance with shortcuts: For two point sequences $A = (a_1, a_2, a_3, \ldots, a_n)$ and
$B = (b_1, b_2, b_3, \ldots, b_m)$, let $F_c(A, B)$ denote the discrete Frechet distance with shortcuts on side B, where
$F_c(A, B) = \min\{F(A, B')\}$ and B' is a non-empty subsequence of B.

Alternatively, we can define the discrete Frechet distance with shortcuts on side B as follows. We loosely call
each edge appearing in the set $E_\delta$ (in Def. 1) a match. Given a match $(a_i, b_j)$, the next match $(a_k, b_l)$ needs to
satisfy one of the three conditions:

a) k = i + 1, l = j;

b) k = i, l > j;

c) k = i + 1, l > j.

In [8], Avraham et al. gave a definition of the discrete Frechet distance with shortcuts. They assumed no simultaneous
jumps on both sides (i.e., case c) does not occur), though they claimed that their algorithm can be easily extended to
the case when simultaneous jumps are allowed.
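For precise sequences, whether the one-sided distance with shortcuts on side B is at most a threshold can be decided by a single greedy scan: each vertex of A is matched to the first vertex of B, at or after the current position, that lies within the threshold. This is a sketch written for illustration (the name `shortcut_decision` is hypothetical) and it allows all three kinds of moves a), b) and c) listed above.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, ...]


def shortcut_decision(a: Sequence[Point], b: Sequence[Point], delta: float) -> bool:
    """Decide whether F_c(A, B) <= delta, with shortcuts allowed on side B only."""
    j = 0  # index of the vertex of b currently matched (0-based)
    for p in a:
        # Skip (shortcut) vertices of b that are too far from the current vertex of a.
        while j < len(b) and math.dist(p, b[j]) > delta:
            j += 1
        if j == len(b):
            return False  # no vertex of b left to match p
    return True
```

In the example `a = [(0, 0), (1, 0), (2, 0)]`, `b = [(0, 0), (5, 5), (1, 0), (2, 0)]`, the outlier `(5, 5)` is skipped and the decision succeeds even for `delta = 0`.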

Now we define the discrete Frechet distance with shortcuts for imprecise data as follows:

Definition ($F_c^{\max}(U, W)$): For two region sequences $U = (u_1, u_2, \ldots, u_n)$ and $W = (w_1, w_2, \ldots, w_m)$, the upper
bound of the discrete Frechet distance with shortcuts on side W is defined as $F_c^{\max}(U, W) = \max\{F_c(A, B)\}$, where
$A = (a_1, a_2, \ldots, a_n)$ (resp. $B = (b_1, b_2, \ldots, b_m)$) is a possible realization of U (resp. W) satisfying
$a_i \in u_i$ and $b_j \in w_j$.

4.1 Computing $F_c^{\max}(U, W)$ when one sequence is imprecise

At first, we consider the case when U is a precise vertex sequence composed of n precise points in $\mathbb{R}^d$ and W is
an imprecise vertex sequence, where each of the m imprecise points is modeled as a ball in $\mathbb{R}^d$.

Let $u'_i$ denote the ball centered at $u_i$ with radius $\delta$, i.e., $u'_i = D(u_i, \delta)$. Let M(i, j) denote the match or
matching pair between $u_i$ and $w_j$.

For the discrete Frechet distance with shortcuts on side W, we only need to consider the jump from M(i, j) to
M(i + 1, k) (k ≥ j), and there is no need to consider the jump from M(i, j) to M(i, l) (l > j). This is because
the match M(i, l) will eventually jump to M(i + 1, l') (l' ≥ l) when i < n, and we can jump directly from M(i, j) to
M(i + 1, l') without passing through M(i, l).

The algorithm to decide $F_c^{\max}(U, W) \le \delta$ is as follows.

We proceed from the starting matching pair M(1, j*(1)) to the ending matching pair M(n, j*(n)) if possible, where
j*(1) is the smallest k (1 ≤ k ≤ m) which satisfies $w_k \subseteq u'_1$, and j*(i) is the index of sequence W computed by
the decision procedure below for each fixed i. Let S denote the set of those matches (or matching pairs).

1. i = 1, j = 1.

2. While (i ≤ n)
     Find a smallest k (j ≤ k ≤ m) which satisfies $w_k \subseteq u'_i$.
     If k exists, let j*(i) = k, add the match M(i, j*(i)) to S,
       and update j = k, i = i + 1.
     Else return $F_c^{\max}(U, W) > \delta$.
   Return $F_c^{\max}(U, W) \le \delta$.

Fig. 7 The decision procedure for $F_c^{\max}(U, W) \le \delta$, where U is a precise sequence and W is an imprecise sequence.
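The decision procedure of Fig. 7 can be sketched in code as follows, assuming each imprecise vertex of W is given as a ball with a center and a radius. A ball is contained in the ball of radius δ around a precise vertex exactly when the center distance plus the ball's radius is at most δ. The function name `decide_upper_bound` and the input layout are choices made for this illustration; indices are 0-based.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def decide_upper_bound(
    u: Sequence[Point], centers: Sequence[Point], radii: Sequence[float], delta: float
) -> Tuple[bool, List[int]]:
    """Return (F_c^max(U, W) <= delta, list of indices j*(i) found so far)."""
    m = len(centers)
    j = 0
    jstar: List[int] = []
    for p in u:
        # Smallest k >= j such that ball w_k lies inside the ball D(p, delta).
        while j < m and math.dist(p, centers[j]) + radii[j] > delta:
            j += 1
        if j == m:
            return False, jstar  # some realization of W blocks every match for p
        jstar.append(j)
    return True, jstar
```

Each vertex of U and each ball of W is visited at most once by the two nested loops, in line with the linear number of containment tests used in the analysis that follows.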

We will show that the above procedure correctly decides whether $F_c^{\max}(U, W) \le \delta$.

Lemma 4.1 There exists a realization of W that makes j*(i) the smallest index of sequence W such that M(i, j*(i))
is reachable by jumps from M(1, j*(1)) for each fixed i. That means there is no monotone increasing path from
M(1, j*(1)) to M(i, j) when 1 ≤ j < j*(i).
Proof We prove this lemma by induction on i.

(1) Basis: When i = 1, there exists a realization $(b_1, b_2, \ldots, b_{j^*(1)-1})$ of $(w_1, w_2, \ldots, w_{j^*(1)-1})$ respectively
which satisfies $d(u_1, b_j) > \delta$ for $1 \le j < j^*(1)$. Hence, there exists a realization of W which makes the match
M(1, j) not reachable when $1 \le j < j^*(1)$.

(2) Inductive hypothesis: We assume that there exists a realization $(b_1, b_2, \ldots, b_{j^*(k)-1})$ of
$(w_1, w_2, \ldots, w_{j^*(k)-1})$ which makes M(k, j) ($1 \le j < j^*(k)$) not reachable when i = k.

(3) Inductive step: We consider the case when i = k + 1. For M(k + 1, j), $j^*(k) \le j < j^*(k+1)$, not in S, there
exists a realization $(b_{j^*(k)}, \ldots, b_{j^*(k+1)-1})$ of $(w_{j^*(k)}, \ldots, w_{j^*(k+1)-1})$ which satisfies
$d(u_{k+1}, b_j) > \delta$ for $j^*(k) \le j < j^*(k+1)$. Based on the inductive hypothesis, there exists a realization
$(b_1, b_2, \ldots, b_{j^*(k)-1})$ which makes M(k, j), $1 \le j < j^*(k)$, not reachable. By combining the two parts, there
exists a realization $(b_1, b_2, \ldots, b_{j^*(k+1)-1})$ of $(w_1, w_2, \ldots, w_{j^*(k+1)-1})$ which makes M(k + 1, j), where
$1 \le j < j^*(k+1)$, not reachable. □

Lemma 4.2 In $\mathbb{R}^d$, given a precise vertex sequence U with size n and an imprecise vertex sequence W with size m
(each modeled as a d-ball), whether $F_c^{\max}(U, W) \le \delta$ can be determined in $O(d(n + m))$ time and space.

Proof If the above decision procedure returns "$F_c^{\max}(U, W) > \delta$" when i = k + 1, then there exists a realization
of W that makes M(k + 1, j) not reachable, for k + 1 ≤ n and j*(k) ≤ j ≤ m, based on Lemma 4.1. That means
$F_c^{\max}(U, W) > \delta$.

If the above decision procedure returns "$F_c^{\max}(U, W) \le \delta$", then there exists a monotone path from M(1, j*(1))
to M(n, j*(n)) for any realization of W. The reason is that, for any i and j*(i), $M(i, j^*(i)) \in S$ implies
$w_{j^*(i)} \subseteq u'_i$, which means $F_c^{\max}(U, W) \le \delta$. The correctness is hence proven.

As for the running time, checking whether $w_j \subseteq u'_i$ takes $O(d)$ time. The decision procedure incrementally tests
along a row- and column-monotone path. Therefore, it runs in $O(d(m + n))$ time and space. □

Let $\delta_{ij} = d(u_i, c_j) + r_j$, where $c_j$ and $r_j$ are the center and radius of $w_j$ respectively. For the optimization
problem, there are a total of $O(mn)$ events when $\delta$ increases continuously. Here, an event $w_j \subseteq u'_i$ occurs when
$\delta$ increases to $\delta_{ij}$. Therefore, we can solve the optimization problem of computing $F_c^{\max}(U, W)$ in
$O(dmn \log mn)$ time by sorting the $\delta_{ij}$'s and performing a binary search. We show how to improve this bound below.
We first consider the planar case in the next theorem.
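The sort-and-search approach can be sketched as follows: the candidate values are the quantities "center distance plus radius" for every pair of a precise vertex and a ball, and the decision procedure is monotone in the threshold, so a binary search over the sorted candidates finds the smallest feasible one. The function names are hypothetical and the decision step is re-implemented inline so that the block is self-contained.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, ...]


def _feasible(u, centers, radii, delta: float) -> bool:
    """Greedy decision: can every u_i be matched, in order, to a ball inside D(u_i, delta)?"""
    j = 0
    for p in u:
        while j < len(centers) and math.dist(p, centers[j]) + radii[j] > delta:
            j += 1
        if j == len(centers):
            return False
    return True


def upper_bound_by_search(
    u: Sequence[Point], centers: Sequence[Point], radii: Sequence[float]
) -> float:
    """Smallest candidate value delta_ij for which the decision procedure succeeds."""
    cands = sorted({math.dist(p, c) + r for p in u for c, r in zip(centers, radii)})
    lo, hi = 0, len(cands) - 1  # the largest candidate is always feasible
    while lo < hi:
        mid = (lo + hi) // 2
        if _feasible(u, centers, radii, cands[mid]):
            hi = mid
        else:
            lo = mid + 1
    return cands[lo]
```

The candidate values are recomputed with the same floating-point expression inside the decision step, so the comparison at the optimal threshold is exact.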

Theorem 4.3 In $\mathbb{R}^2$, given a precise vertex sequence U with size n and an imprecise vertex sequence W with size m,
all modeled as disks in $\mathbb{R}^2$ with an equal radius, $F_c^{\max}(U, W)$ can be computed in
$O((m^{2/3} n^{2/3} + m + n) \log^3(m + n))$ time.


Proof The event $w_j \subseteq u'_i$ occurs when $\delta = \delta_{ij}$. We do not need to sort the $O(mn)$ distances; instead, we can
use the distance selection algorithm in ifT^ as follows. One can select the k-th smallest pairwise distance in
$A \times B$, where A and B are two precise vertex sequences in the plane and |A| = n, |B| = m. The running time of this
distance selection algorithm is $O((m^{2/3} n^{2/3} + m + n) \log^2(m + n))$ |[T^. By combining this distance selection
algorithm and the binary search, we can compute $F_c^{\max}(U, W)$ in
$O((m + n) \log(mn) + (m^{2/3} n^{2/3} + m + n) \log^2(m + n) \log(mn)) = O((m^{2/3} n^{2/3} + m + n) \log^3(m + n))$ time. □

Unfortunately, the distance selection algorithm cannot be extended to high-dimensional space. Hence, in
dimensions higher than two, we use a dynamic programming method to compute $F_c^{\max}(U, W)$.

Let U(i, j) (resp. W(i, j)) denote the partial sequence $U(i, j) = (u_i, u_{i+1}, \ldots, u_j)$ (resp.
$W(i, j) = (w_i, w_{i+1}, \ldots, w_j)$). $F^{\max}(i, j)$ denotes the upper bound of the discrete Frechet distance with shortcuts
on side W(1, j) for sequences U(1, i) and W(1, j), and $F_r^{\max}(i, j)$ denotes the upper bound of the discrete Frechet
distance with shortcuts on side W(1, j) between U(1, i) and W(1, j) on the condition that $w_j$ is retained (not cut),
which means that $w_j$ matches $u_i$, namely $F_r^{\max}(i, j) = \max\{F^{\max}(i-1, j), \delta_{ij}\}$. In $F^{\max}(i, j)$, by
contrast, $u_i$ may not match $w_j$, as $w_j$ may be cut.

Then we have the recurrence relations as follows.

$$
F_r^{\max}(i+1, j) = \max\{F^{\max}(i, j),\ \delta_{i+1,j}\}, \quad i > 0,
$$

$$
F^{\max}(i, j+1) =
\begin{cases}
\min\{F^{\max}(i, j),\ F_r^{\max}(i, j+1)\}, & j > 0, \\
F_r^{\max}(i, j+1), & j = 0.
\end{cases}
$$

We need to run these two recurrence relations alternately, e.g., computing $F_r^{\max}(i, *)$ (where (i, *) denotes
{(i, j), 1 ≤ j ≤ m}), then $F^{\max}(i, *)$, then $F_r^{\max}(i+1, *)$, etc. It is easy to see that this dynamic programming
algorithm takes $O(mn)$ time and space after all the distances $\delta_{ij}$ are calculated in $O(dmn)$ time and space. On the
other hand, the space complexity can be improved, as we only need to store a constant number of columns of values and
compute $\delta_{ij}$ when needed. Hence we have the following theorem.
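One way to realize this dynamic program is sketched below, under two stated assumptions: the imprecise vertices are balls given by centers and radii, and the value for an empty prefix of U is taken to be 0 (a base case chosen for this sketch so that the first row reduces to the worst-case distances themselves). The function name `upper_bound_dp` is hypothetical. Only one previous row is stored and the worst-case distances are computed on demand.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, ...]


def upper_bound_dp(
    u: Sequence[Point], centers: Sequence[Point], radii: Sequence[float]
) -> float:
    """Upper bound of the one-sided shortcut distance for precise U and ball-shaped W."""
    m = len(centers)
    prev = [0.0] * m  # assumed base case: value for the empty prefix of U
    for p in u:
        cur = [math.inf] * m
        best = math.inf
        for j in range(m):
            d_ij = math.dist(p, centers[j]) + radii[j]  # worst-case distance to ball j
            retained = max(prev[j], d_ij)  # ball j is kept and matched to p
            best = min(best, retained)     # ball j may also be cut
            cur[j] = best
        prev = cur
    return prev[m - 1]
```

On the inputs used to illustrate the sort-and-search approach, this returns the same values, which is a useful sanity check when experimenting with either variant.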

Theorem 4.4 In $\mathbb{R}^d$, given a precise vertex sequence U of size n and an imprecise vertex sequence W of size m
(each modeled as a d-ball), $F_c^{\max}(U, W)$ can be computed in $O(dmn)$ time and $O(d(m + n))$ space.

4.2 Computing $F_c^{\max}(U, W)$ when both sequences are imprecise

In this subsection, we consider the problem of computing $F_c^{\max}(U, W)$ when both $U = (u_1, u_2, \ldots, u_n)$ and
$W = (w_1, w_2, \ldots, w_m)$ are imprecise sequences, where each vertex is modeled as a disk in $\mathbb{R}^2$. (Our algorithm
works in $\mathbb{R}^d$, but as it involves the Voronoi diagram in $\mathbb{R}^d$, the high cost makes it impractical.)

For two imprecise sequences U, W, and a precise point p, the maximal distance between p and a
region $w_x$ is defined as $D_{\max}(p, w_x) = \max\{d(p, q), q \in w_x\}$. Let
$D_{\min}(p, j, k) = \min_{j \le x \le k}\{D_{\max}(p, w_x)\}$ denote the minimal such distance between a point p and the
regions $\{w_j, w_{j+1}, \ldots, w_k\}$.

We define $D(i, j, k) = \max\{D_{\min}(p, j, k), p \in u_i\}$. In this subsection, we compute j*(i) by using the
decision procedure below:

1. i = 1, j = 1.

2. While (i ≤ n)
     Find a smallest k (j ≤ k ≤ m) which satisfies $D(i, j, k) \le \delta$.
     If k exists, let j*(i) = k, add M(i, j*(i)) to S,
       and update j = k, i = i + 1.
     Else return $F_c^{\max}(U, W) > \delta$.
   Return $F_c^{\max}(U, W) \le \delta$.

Fig. 8 The decision procedure for $F_c^{\max}(U, W) \le \delta$ when both U and W are imprecise sequences.
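The procedure of Fig. 8 can be prototyped as below. Note the substitution: instead of the Voronoi-diagram-based exact computation of D(i, j, k), this sketch estimates it by sampling points of the disk $u_i$, which yields a lower estimate in general and is exact only when $u_i$ has radius 0. All names (`sample_disk`, `d_estimate`, `decide_both_imprecise`) and the sampling resolution are assumptions made for illustration; disks are given as (center, radius) pairs with 0-based indices.

```python
import math
from typing import List, Sequence, Tuple

Disk = Tuple[Tuple[float, float], float]  # (center, radius)


def sample_disk(disk: Disk, rings: int = 8, per_ring: int = 64) -> List[Tuple[float, float]]:
    """Center plus points on concentric circles, the outermost one being the boundary."""
    (cx, cy), r = disk
    if r == 0:
        return [(cx, cy)]
    pts = [(cx, cy)]
    for a in range(1, rings + 1):
        rho = r * a / rings
        for b in range(per_ring):
            t = 2 * math.pi * b / per_ring
            pts.append((cx + rho * math.cos(t), cy + rho * math.sin(t)))
    return pts


def d_estimate(u_disk: Disk, w: Sequence[Disk], j: int, k: int) -> float:
    """Sampled estimate of D(i, j, k): max over p in u_i of min over w_j..w_k of D_max(p, w_x)."""
    return max(
        min(math.dist(p, c) + r for c, r in w[j:k + 1]) for p in sample_disk(u_disk)
    )


def decide_both_imprecise(u: Sequence[Disk], w: Sequence[Disk], delta: float) -> Tuple[bool, List[int]]:
    """Sampling-based version of the decision procedure; returns (answer, indices j*(i))."""
    j, jstar = 0, []
    for u_disk in u:
        k = j
        while k < len(w) and d_estimate(u_disk, w, j, k) > delta:
            k += 1
        if k == len(w):
            return False, jstar
        jstar.append(k)
        j = k
    return True, jstar
```

Because the sampled value never exceeds the true D(i, j, k), a negative answer from this sketch is reliable, while a positive answer is only as trustworthy as the sampling density.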


Lemma 4.5 Given two imprecise vertex sequences U and W with sizes |U| = n and |W| = m (each vertex modeled
as a disk in $\mathbb{R}^2$), whether $F_c^{\max}(U, W) \le \delta$ can be determined in $O(m^2 + n)$ time and $O(m + n)$ space.

Proof The correctness is given as follows.

(1) If the decision procedure returns "$F_c^{\max}(U, W) > \delta$", then there exists a realization of U and W which makes
it impossible to reach the last matching pair M(n, j*(n)). The argument is similar to Lemma 4.1 and omitted here.
That means $F_c^{\max}(U, W) > \delta$.

(2) If the decision procedure returns "$F_c^{\max}(U, W) \le \delta$", then S has n elements, i.e.,
S = {M(1, j*(1)), M(2, j*(2)), M(3, j*(3)), ..., M(n, j*(n))}, where j*(i) ≤ j*(i + 1). We claim that there exists a
monotone matching pair set {M(i, j(i)) | 1 ≤ i ≤ n} (j(i) ≤ j*(i)) under any realization of U and W, where j(i) is an
index of W such that $u_i$ and $w_{j(i)}$ match, and j(i) ≤ j(i + 1).

We prove the claim by induction on i.

(2.1) Basis: When i = 1, if all the matching pairs M(1, j), j ≤ j*(1), are not reachable, then there exist
a realization $b_x$, 1 ≤ x ≤ j*(1), and $a_1$ which satisfy $d(a_1, b_x) > \delta$. Then $D(1, 1, j^*(1)) > \delta$, and we have a
contradiction. That means there exists j(1) ≤ j*(1) such that M(1, j(1)) is possible under any realization.

(2.2) Inductive hypothesis: We assume that the claim holds when i = l.

(2.3) Inductive step: Now we consider the case when i = l + 1. By the inductive hypothesis, there exists a
monotone matching set M(1, j(1)), M(2, j(2)), ..., M(l, j(l)), j(i) ≤ j*(i), under any realization of U(1, l) and
W(1, j(l)). As $D(l+1, j(l), j^*(l+1)) \le D(l+1, j^*(l), j^*(l+1)) \le \delta$, there exists a matching pair
M(l + 1, j(l + 1)), where j(l + 1) ≤ j*(l + 1), which is reachable by jumping directly from M(l, j(l)) under any
realization of U(l + 1, l + 1) and W(j(l), j(l + 1)). Hence, if the decision procedure returns
"$F_c^{\max}(U, W) \le \delta$", then $F_c^{\max}(U, W) \le \delta$.

We now compute the time it takes to find a smallest k (j ≤ k ≤ m) satisfying $D(i, j, k) \le \delta$. The steps to
compute D(i, j, k) are as follows.

(I) We compute the inverted additive Voronoi diagram [21] (iaVD) of the imprecise vertices $w_j, w_{j+1}, \ldots, w_k$, modeled
as disks (which may have different sizes). This takes $O((k - j) \log(k - j))$ time.

(II) If the imprecise region $u_i$ intersects the boundary of the iaVD, then some vertex of the partial boundary within
$u_i$ would be the realization of $u_i$ in computing D(i, j, k). Otherwise, the imprecise region $u_i$ is located in the cell
controlled by some site $w_x$. Then the diameter of the region $u_i \cup w_x$ would be D(i, j, k). This step takes
$O(k - j)$ time.

As we need to construct the inverted additive Voronoi diagram incrementally, each single insertion takes $O(s)$
time, where s is the size of the iaVD. Hence the total time to find a smallest k (j ≤ k ≤ m) is
$\sum_{j \le x \le k} (x - j) = O((k - j)^2)$. Therefore, the total time complexity is
$\sum_{1 \le i < n} (j^*(i+1) - j^*(i))^2 = O(m^2 + n)$. □


For the optimization problem, we again use a dynamic programming algorithm. The algorithm is similar to that in Theorem 4.4; the difference is that it uses $D(i, j, k)$ instead of $\delta_{ij}$. The recurrence relation is as follows.


$$F^{\max}(i+1, k) = \begin{cases} \min_{1 \le j \le k} \{\max\{F^{\max}(i, j),\ D(i+1, j, k)\}\}, & i > 0 \\ \ldots, & i = 0 \end{cases}$$

It seems that the dynamic programming algorithm takes $O(nm^2)$ time after all the distances $D(i, j, k)$ are calculated in $O(nm^3)$ time. However, we can use the Monge property to speed up the dynamic programming: we only need to compute $O(nm)$ distances $D(i, j, k)$, in $O(nm^2)$ time. The property rests on the following observations.

(1) $F^{\max}(i, j)$ is a monotone decreasing function of $j$ for a fixed $i$.

(2) $D(i+1, j, k)$ is a monotone increasing function of $j$ for fixed $i$ and $k$.

(3) Let $j_k$ denote the index satisfying $F^{\max}(i+1, k) = \max\{F^{\max}(i, j_k), D(i+1, j_k, k)\}$, and let $j_{k+1}$ denote the index satisfying $F^{\max}(i+1, k+1) = \max\{F^{\max}(i, j_{k+1}), D(i+1, j_{k+1}, k+1)\}$; then $j_{k+1} \ge j_k$.

Hence we only need to try the distances $D(i+1, j_k, k+1), D(i+1, j_k + 1, k+1), D(i+1, j_k + 2, k+1), \ldots, D(i+1, j_{k+1}, k+1), D(i+1, j_{k+1} + 1, k+1)$ when computing $F^{\max}(i+1, k+1)$ for a fixed $i$ and $k$, namely $(j_{k+1} - j_k + 2)$ distances. Hence the total number of distances is $O(m)$ for a fixed $i$.

Hence $F^{\max}(i+1, k)$ $(1 \le k \le m)$ can be calculated in $O(m)$ time after the distances $D(i+1, j, k)$ $(1 \le k \le m)$ are calculated for a fixed $i$. We then only need to try $O(m)$ distances $D(i+1, j, k)$ $(1 \le k \le m)$: the update of the iaVD needs at most $O(m)$ insert operations, $O(m)$ deletion operations, and $O(m)$ query operations, each of which takes at most $O(m)$ time. Hence the total time is $O(m^2)$ for a fixed $i$, and we have the theorem below.
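Observations (1) and (2) mean that, for fixed $i$ and $k$, the quantity $\max\{F^{\max}(i, j), D(i+1, j, k)\}$ first falls and then rises as $j$ grows, so its minimum sits where the two curves cross. The sketch below illustrates this for a single table entry. It is only an illustration under stated assumptions: `prev` is a hypothetical 1-indexed list holding $F^{\max}(i, \cdot)$, `dist(j, k)` is a hypothetical oracle for $D(i+1, j, k)$, and the crossing is located by binary search rather than by the monotone pointer walk that observation (3) supports.

```python
def next_value_naive(prev, dist, k):
    """min over 1 <= j <= k of max(prev[j], dist(j, k)), by full scan."""
    return min(max(prev[j], dist(j, k)) for j in range(1, k + 1))

def next_value_crossing(prev, dist, k):
    """Same value, assuming prev[j] is non-increasing in j and
    dist(j, k) is non-decreasing in j (observations (1) and (2))."""
    lo, hi = 1, k + 1
    # Find the first j in [1, k] with dist(j, k) >= prev[j]; k + 1 if none.
    while lo < hi:
        mid = (lo + hi) // 2
        if dist(mid, k) >= prev[mid]:
            hi = mid
        else:
            lo = mid + 1
    candidates = []
    if lo <= k:   # right of the crossing the maximum is dist, smallest at lo
        candidates.append(max(prev[lo], dist(lo, k)))
    if lo > 1:    # left of the crossing the maximum is prev, smallest at lo - 1
        candidates.append(max(prev[lo - 1], dist(lo - 1, k)))
    return min(candidates)
```

Each call inspects only $O(\log k)$ candidate indices $j$; the pointer walk in the text goes further and shares the work across consecutive values of $k$.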


Theorem 4.6 Given two imprecise sequences $U$ and $W$ of size $n$ and $m$ respectively, where each imprecise vertex is modeled as a disk, the upper bound of the discrete Fréchet distance with shortcuts between $U$ and $W$ can be computed in $O(nm^2)$ time.

We comment that when both $U$ and $W$ are imprecise, our algorithm could still work in $\mathbb{R}^d$. But due to the high cost (such as constructing the $d$-dimensional Voronoi diagram), the algorithm then becomes impractical. Hence, we only focus on the problem in $\mathbb{R}^2$ for this case.


5 Concluding remarks 

In this paper, we consider the problem of computing the discrete Fréchet distance of imprecise input. We address the open problem posed by Ahn et al. [3, 4] a few years ago, and show that the discrete Fréchet distance upper bound problem of imprecise data is NP-hard. Our NP-hardness proof is quite complicated; the construction has a combinatorial and a geometric part. In the combinatorial part, we interpret the imprecise discrete distance in terms of finding monotone paths through a colored grid graph; in the geometric part, we show that the relevant colored free space diagram grids can be realized geometrically. Given two imprecise vertex sequences $U$, $W$ (each vertex modeled as a $d$-dimensional ball), we show that the upper bound of the discrete Fréchet distance between $U$ and $W$ can be computed in polynomial time if shortcuts are allowed on one side. It would be interesting to consider these problems under the continuous Fréchet distance.
7 Appendix


7.1 Proof of Lemma 1 

Proof Let $\phi$ be a Boolean formula in conjunctive normal form with $n$ variables $x_1, x_2, \ldots, x_n$ and $m$ clauses $C_1, C_2, \ldots, C_m$, each of size at most three. We take the following steps to construct an instance $G$ of ISCPCS.

For each Boolean variable $x_i$ in $\phi$, we use two vertices with the same color (and different variables always use different colors), one denoted as $x_i$, the other denoted as $\bar{x}_i$. See $x_1$, $x_2$, and $x_3$ in Figure 1 for example. Eventually, we have to pick one of the two vertices to retain this color. One represents that the variable $x_i$ is assigned the value True, and the other corresponds to the value False.

For each clause $C_i$ in $\phi$, we construct three vertices with the same color (which has never been used before). (We use shapes instead of colors in Figure 1 to emphasize the difference from variables; for example, we use three triangular vertices in Figure 1 to denote one three-literal clause.) We then add directed edges between the clause vertices and variable vertices. The rule is as follows: let vertex $p_i$ be the vertex used to denote variable $\bar{x}_i$ (resp. $x_i$), and let vertex $c_{i,j}$ be the vertex used to denote the clause $C_j$ which contains $x_i$ (resp. $\bar{x}_i$); then we add a directed edge from $c_{i,j}$ to $p_i$, see Figure 1.

Finally, we add edges from the source $s$ to each vertex denoting a clause, and add edges from each vertex denoting a variable to the destination $t$. It is easy to ensure that there is no crossing between the edges: all the vertices denoting variables are arranged from left to right, there is a clause vertex for each literal (variable vertex), and each of the clause vertices connecting to a fixed variable vertex $x_i$ is just below the variable vertex $x_i$. Let the resulting directed geometric graph be $D$. We next complete the proof by proving that $\phi$ is satisfiable iff there exists an induced subgraph $G_s$ such that in $G_s$ there is no path from $s$ to $t$.

‘→’ If $\phi$ is satisfiable with some truth assignment, then there is at least one true literal in each clause $C_i$. We show how to compute the induced subgraph $G_s$ from $G$ as follows. For each pair of variable vertices representing $\{x_j, \bar{x}_j\}$, we pick the one which is assigned True. Let $C_i = u \vee v \vee w$, where $u, v, w$ are literals of the form $x_j$ or $\bar{x}_j$. Then we have three clause vertices $c_{u,i}$, $c_{v,i}$ and $c_{w,i}$, of the same color $color_i$, representing the clause $C_i$. WLOG, suppose that $u$ is a true literal in $C_i$ (pick any one true literal if more than one literal in $C_i$ is true), and we choose the vertex $c_{u,i}$ to cover the color $color_i$. By construction, in $D$, there is no edge from the vertex $c_{u,i}$ to the vertex $p_u$ representing $u$; the only variable vertex that $c_{u,i}$ points to represents the negation of $u$, which is not picked. Hence, there is no path from $s$ to $t$ crossing the clause vertex $c_{u,i}$ (representing clause $C_i$). As this holds for all clauses and any path from $s$ to $t$ has to pass a clause vertex, there exists an induced subgraph $G_s$ consisting of exactly one vertex of each color with no path from $s$ to $t$.

‘←’ If there exists an induced subgraph $G_s$ consisting of exactly one vertex of each color with no path from $s$ to $t$, we need to prove that $\phi$ is satisfiable. Suppose to the contrary that $\phi$ is not satisfiable; then, under the truth assignment given by the picked variable vertices, at least one clause is not satisfied. Let this clause be $C_i = u \vee v \vee w$, and let the clause vertices $c_{u,i}$, $c_{v,i}$ and $c_{w,i}$ connect to the variable vertices $p_u$, $p_v$ and $p_w$ in $D$, which correspond to the variables of $u, v, w$ respectively. As $u$, $v$ and $w$ are all false, $p_u$, $p_v$, $p_w$ are all picked in the induced subgraph. Then, there exists a path from $s$ to $t$ passing through $c_{u,i}$, $c_{v,i}$ or $c_{w,i}$, as one of them must be picked. A contradiction!

Hence, $\phi$ is satisfiable if and only if there exists an induced subgraph $G_s$ consisting of exactly one vertex of each color with no path from $s$ to $t$. The reduction obviously takes $O(n + m)$ time. □
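The reduction can be checked by brute force on small formulas. The sketch below is one reading of the construction and not the authors' code: literals are written as nonzero integers (a DIMACS-style convention assumed here), the vertex `("lit", v)` stands for the variable vertex picked when literal `v` is true, and each clause vertex for a literal points to the variable vertex of the negated literal. All function names are hypothetical.

```python
from itertools import product

def build_graph(num_vars, clauses):
    """Return (color classes, directed edges) of the colored graph."""
    colors, edges = {}, set()
    for i in range(1, num_vars + 1):
        colors[("var", i)] = [("lit", i), ("lit", -i)]
        edges.add((("lit", i), "t"))
        edges.add((("lit", -i), "t"))
    for j, clause in enumerate(clauses):
        colors[("clause", j)] = []
        for lit in clause:
            c = ("cl", j, lit)
            colors[("clause", j)].append(c)
            edges.add(("s", c))
            edges.add((c, ("lit", -lit)))  # edge to the negated literal's vertex
    return colors, edges

def has_path(kept, edges):
    """Depth-first search from s to t inside the induced subgraph on kept."""
    adj = {}
    for a, b in edges:
        if a in kept and b in kept:
            adj.setdefault(a, []).append(b)
    stack, seen = ["s"], {"s"}
    while stack:
        v = stack.pop()
        if v == "t":
            return True
        for w in adj.get(v, []):
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return False

def exists_separating_selection(num_vars, clauses):
    """Try every choice of one vertex per color; s and t are always kept."""
    colors, edges = build_graph(num_vars, clauses)
    for choice in product(*colors.values()):
        if not has_path(set(choice) | {"s", "t"}, edges):
            return True
    return False

def brute_force_sat(num_vars, clauses):
    """Plain truth-table satisfiability test, used only for comparison."""
    for bits in product([False, True], repeat=num_vars):
        if all(any(bits[abs(l) - 1] == (l > 0) for l in c) for c in clauses):
            return True
    return False
```

Both searches are exponential and serve only to confirm, on toy inputs, that a separating selection exists exactly when the formula is satisfiable.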

7.2 Details for realizing the grid graph geometrically 

The details for realizing the grid graph are given here. First, we create the points used for determining the position of the vertices in $H$ and $Q$. Let $\theta$ satisfy $\max\{m, n\} \cdot \theta < \pi/20$. Let $N = \max\{m, n\}$, and WLOG, let $N$ be even. We construct a circle $C(O, r)$, where $O = (0, \epsilon/2)$ and $r = \epsilon/2$.

We construct a sequence of points $P_i$ ($i = 0, 1, 2, \ldots, N$) on the lower half of circle $C(O, r)$ in counterclockwise order, and the point $P_{N/2}$ overlaps with point $(0, 0)$, see Figure 6. Note that the distance between two adjacent points $P_i$ and $P_{i+1}$ is $L = 2r \sin\frac{\theta}{2} = \epsilon \sin\frac{\theta}{2}$, and $\angle P_i O P_{i+1} = \theta$, for $i = 0, 1, 2, \ldots, N - 1$. Hence, all the points $P_i$, $i = 0, 1, 2, \ldots, N$, are within a region of diameter less than $\frac{\epsilon}{2} \cdot \frac{\pi}{20} < \epsilon/10$. (We comment that in Figure 6 these points are spread out much more than they should be, as we need the space for putting the labels.)

We then construct a sequence of points $M_i$ ($i = 0, 1, 2, \ldots, N$) on the upper half of circle $C(O, r)$ in counterclockwise order. Each line $P_i M_i$ passes through the center of $C(O, r)$; namely, $M_i$ is the reflection of the point $P_i$ about point $O$.

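It is obvious that $d(M_i, P_i) = \epsilon$ and $d(M_i, P_j) < \epsilon$, $i \ne j$. Recall that $D(P_i, \epsilon')$ is the neighborhood (disk) centered at $P_i$ with radius $\epsilon'$. Here, $\epsilon'$ is a fixed fraction of $\min\{\epsilon - d(P_i, M_j) : i \ne j\}$, chosen so that, by the triangle inequality, $d(p, q) \le d(P_i, M_j) + 2\epsilon' < \epsilon$ for $p \in D(P_i, \epsilon')$, $q \in D(M_j, \epsilon')$, and $i \ne j$.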
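The distance claims for the points $P_i$ and $M_i$ are easy to confirm numerically. The following sketch places the points as described, with $P_{N/2}$ at the origin and consecutive points an angle $\theta$ apart; the function name `build_points` and the sample values of $\epsilon$, $N$ and $\theta$ used to exercise it are assumptions made for illustration.

```python
import math

def build_points(eps, n_big, theta):
    """Return lists P, M of N + 1 points each on the circle C(O, r),
    where O = (0, eps/2) and r = eps/2."""
    ox, oy, r = 0.0, eps / 2, eps / 2
    P, M = [], []
    for i in range(n_big + 1):
        # Angle -pi/2 is the bottom of the circle, i.e. the point (0, 0).
        ang = -math.pi / 2 + (i - n_big / 2) * theta
        px, py = ox + r * math.cos(ang), oy + r * math.sin(ang)
        P.append((px, py))
        M.append((2 * ox - px, 2 * oy - py))  # reflection of P_i about O
    return P, M

def dist(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])
```

For example, with $\epsilon = 1$, $N = 8$ and any $\theta$ with $N\theta < \pi/20$, every pair $(P_i, M_i)$ is at distance $\epsilon$ (a diameter of the circle), while every mixed pair $(P_j, M_i)$, $i \ne j$, is strictly closer.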
Figure 7: Illustration of the deployment of vertices of $P_i$'s and $Q$, corresponding to Figure 4.


Let $P'_i$ be the reflection of the point $P_i$ across the horizontal line $y = -3\epsilon/4$. Let $M'_i$ be the reflection of the point $M_i$ across the horizontal line $y = -3\epsilon/4$. We finish the steps of our construction in order as follows.

1. The imprecise vertices $\{q_2, \ldots, q_{m+1}\}$ used to construct the clause gadgets are deployed in the upper half of the circle $C((0, \frac{\epsilon}{2}), \frac{\epsilon}{2})$. An example is given in Figure 7 (see $q_2$, $q_3$, $q_4$ there). For an imprecise vertex $q_i$, $2 \le i \le m + 1$, three points $h_{i_1}$, $h_{i_2}$, $h_{i_3}$ are located in $D(P_{i-2}, \epsilon')$, $q_i$ is located in $D(M_{i-2}, \epsilon')$, and the three circles $C(h_{i_1}, \epsilon)$, $C(h_{i_2}, \epsilon)$, $C(h_{i_3}, \epsilon)$ cover $q_i$ as in Figure 8. The clause gadget is constructed as follows. The point $h_{i_2}$ overlaps with $P_{i-2}$, $h_{i_1}$ is located to the left of line $P_{i-2}M_{i-2}$ at distance $\epsilon'/3$ from $h_{i_2}$, and $h_{i_3}$ is located to the right of line $P_{i-2}M_{i-2}$ at distance $\epsilon'/3$ from $h_{i_2}$. The points $h_{i_1}$, $h_{i_2}$, $h_{i_3}$ are located on the same line perpendicular to line $P_{i-2}M_{i-2}$. The three intersections between $C(h_{i_1}, \epsilon)$, $C(h_{i_2}, \epsilon)$, $C(h_{i_3}, \epsilon)$ are $s_1, s_2, s_3$ from left to right above the horizontal line $h_{i_1}h_{i_3}$. Let $q_i$ be the rectangle with length $2\epsilon'/3$ and width $\epsilon''$; the upper long side of $q_i$ passes through $s_1$, $s_3$ and is symmetric about the line $P_{i-2}M_{i-2}$, and the lower long side of $q_i$ passes through $s_2$. We have $\epsilon'' < d(s_2, M_{i-2}) = \epsilon - \sqrt{\epsilon^2 - (\epsilon'/3)^2} < \epsilon'/3$. Hence $d(M_{i-2}, q) < \epsilon'$ when $q \in q_i$.

The imprecise vertices $\{q_j \mid 2 \le j \le m + 1, j \ne i\}$ are fully covered by $C(h_{i_1}, \epsilon)$, $C(h_{i_2}, \epsilon)$, $C(h_{i_3}, \epsilon)$, as $d(p, q) < \epsilon$ for $p \in D(P_i, \epsilon')$, $q \in D(M_j, \epsilon')$, $i \ne j$. The above design can ensure that, in the corresponding free space grid graph, any one of the three cells $C[i, i_1]$, $C[i, i_2]$, $C[i, i_3]$ can be passed by a potential monotone path, and the cells $C[i, j_1]$, $C[i, j_2]$, $C[i, j_3]$, $j \ne i$, can also be passed ($h_{j_1}$, $h_{j_2}$, $h_{j_3}$ and $q_j$, $j \ne i$, are used to construct another clause gadget), while the rest of the cells in the $i$-th row cannot be passed.

Figure 8: Illustration of the precise location of $h_{i_1}$, $h_{i_2}$, $h_{i_3}$ and $q_i$.

2. The imprecise vertices $\{q_{m+3}, \ldots, q_{n+m+2}\}$ used to construct the variable gadgets are deployed in the lower half of another circle $C((0, -2\epsilon), \frac{\epsilon}{2})$. For an example, see Figure 7. For an imprecise vertex $q_i$, $m + 3 \le i \le n + m + 2$, two points $h_{i_1}$, $h_{i_2}$ are located in $D(P'_{i-m-3}, \epsilon')$, $q_i$ is located in $D(M'_{i-m-3}, \epsilon')$, and the long side of $q_i$ is parallel to the tangent line at $M'_{i-m-3}$. Two circles $C(h_{i_1}, \epsilon)$, $C(h_{i_2}, \epsilon)$ cover $q_i$ as in Figure 3(a) to construct a variable gadget. The other imprecise vertices $\{q_j \mid m + 3 \le j \le n + m + 2, j \ne i\}$ are fully covered by $C(h_{i_1}, \epsilon)$, $C(h_{i_2}, \epsilon)$; the precise location is similar to the construction in Figure 8. This design can ensure that, in the corresponding free space grid graph, one of the cells $C[i, i_1]$, $C[i, i_2]$ can be passed by a potential monotone path, and the cells $C[i, j_1]$, $C[i, j_2]$, $j \ne i$, can also be passed ($h_{j_1}$, $h_{j_2}$ and $q_j$ are used to construct another variable gadget), but the other cells in the $i$-th row cannot be passed.

3. The first imprecise vertex $q_1$ and the last imprecise vertex $q_{n+m+3}$ are deployed in $D((0, -3\epsilon/4), \epsilon')$, which is fully covered by all the circles $C(p, \epsilon)$, $p \in \bigcup D(P_i, \epsilon')$ or $p \in \bigcup D(P'_i, \epsilon')$, as $\frac{3\epsilon}{4} + \frac{\epsilon}{10} + 2\epsilon' < \epsilon$. This design can ensure that, in the corresponding free space grid graph, all the cells in the first row and last row can be passed by a potential monotone path.

4. The imprecise vertex $q_{m+2}$ is deployed inside the region $D((0, 0), \epsilon')$, which is fully covered by every circle $C(p, \epsilon)$, $p \in \bigcup D(P_i, \epsilon')$, but is not covered by any circle $C(p, \epsilon)$, $p \in \bigcup D(P'_i, \epsilon')$. This design can ensure that, in the corresponding free space grid graph, all the cells $C[m + 2, i_1]$, $C[m + 2, i_2]$, $C[m + 2, i_3]$ in the $(m + 2)$-th row can be passed ($h_{i_1}$, $h_{i_2}$, $h_{i_3}$ are used to construct the clause gadget with $q_i$), while the other cells in the $(m + 2)$-th row cannot be passed.

5. Finally, we adjust the order of the vertices in $H$ and rename them. If there is a variable cell in the $j$-th row of the free space grid graph, and there are a total of $k$ clause cells connecting to it, say the row numbers of the $k$ cells are $1', 2', \ldots, k'$ respectively, then we choose one point (never renamed before) from the three points $h_{i'_1}$, $h_{i'_2}$, $h_{i'_3}$ for each fixed $i'$, and rename those $k$ points as $\ldots, h_{j-1}$ in order.
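The inequalities used for the clause gadget in step 1 can be verified in a local coordinate frame. The sketch below is a hedged check and not part of the original construction: it assumes the frame in which $P_{i-2}$ is the origin and $M_{i-2} = (0, \epsilon)$, so that the three centers lie on the $x$-axis, and the name `clause_gadget` and the dictionary keys are illustrative.

```python
import math

def clause_gadget(eps, eps1):
    """Local-frame coordinates of one clause gadget (eps1 plays eps')."""
    a = eps1 / 3
    h1, h2, h3 = (-a, 0.0), (0.0, 0.0), (a, 0.0)   # centers, spacing eps'/3
    m = (0.0, eps)                                  # M_{i-2} in this frame
    y_outer = math.sqrt(eps ** 2 - (a / 2) ** 2)    # height of s1 and s3
    y_mid = math.sqrt(eps ** 2 - a ** 2)            # height of s2
    s1, s2, s3 = (-a / 2, y_outer), (0.0, y_mid), (a / 2, y_outer)
    width = y_outer - y_mid                         # eps'' of the rectangle
    # Rectangle q_i: length 2 * eps'/3, top side through s1 and s3,
    # bottom side through s2, symmetric about the line x = 0.
    corners = [(-a, y_mid), (a, y_mid), (-a, y_outer), (a, y_outer)]
    return {"h": (h1, h2, h3), "m": m, "s": (s1, s2, s3),
            "width": width, "corners": corners}

def dist(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])
```

In this frame $s_1$ is the intersection of the circles around $h_{i_1}$ and $h_{i_2}$, $s_2$ that of the circles around $h_{i_1}$ and $h_{i_3}$, and $s_3$ that of the circles around $h_{i_2}$ and $h_{i_3}$, which is why $s_2$ sits lowest and on the line through $P_{i-2}$ and $M_{i-2}$.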